Persistently Optimal Policies in Stochastic Dynamic Programming with Generalized Discounting

Similar resources

Persistently Optimal Policies in Stochastic Dynamic Programming with Generalized Discounting

In this paper we study a Markov decision process with a non-linear discount function. Our approach is in the spirit of the von Neumann-Morgenstern concept and is based on the notion of expectation. First, we define a utility on the space of trajectories of the process in the finite and infinite time horizon and then take their expected values. It turns out that the associated optimization problem l...

Regular Policies in Stochastic Optimal Control and Abstract Dynamic Programming

Notation. Connection with Abstract DP. Mapping of a stationary policy $\mu$: for any control function $\mu$, with $\mu(x) \in U(x)$ for all $x$, and $J \in E(X)$, define the mapping $T_\mu : E(X) \mapsto E(X)$ by $(T_\mu J)(x) = E\{g(x, \mu(x), w) + \alpha J(f(x, \mu(x), w))\}$, $x \in X$. Value iteration mapping: for any $J \in E(X)$, define the mapping $T : E(X) \mapsto E(X)$ by $(TJ)(x) = \inf_{u \in U(x)} E\{$...
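As a rough illustration of the policy mapping T_μ quoted in this excerpt, the following Python sketch applies it on a small finite model. The two-state model, the cost g, the dynamics f, the disturbance distribution, the policy and the value α = 0.5 are all hypothetical choices made for the example and do not come from the listed paper. The value iteration mapping T is left out because its formula is truncated in the excerpt.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical finite model (an assumption, not from the source): states,
# controls and disturbances are ints, and the disturbance w is drawn from a
# finite list of (probability, value) pairs.
Cost = Callable[[int, int, int], float]    # g(x, u, w)
Dynamics = Callable[[int, int, int], int]  # f(x, u, w)


def apply_T_mu(
    J: Dict[int, float],
    mu: Dict[int, int],
    g: Cost,
    f: Dynamics,
    disturbances: List[Tuple[float, int]],
    alpha: float,
) -> Dict[int, float]:
    """Return T_mu J, with (T_mu J)(x) = E{ g(x, mu(x), w) + alpha * J(f(x, mu(x), w)) }."""
    return {
        x: sum(
            p * (g(x, mu[x], w) + alpha * J[f(x, mu[x], w)])
            for p, w in disturbances
        )
        for x in J
    }


# Toy data, chosen only to make the example concrete.
disturbances = [(0.5, 0), (0.5, 1)]          # w is 0 or 1 with equal probability


def g(x: int, u: int, w: int) -> float:
    return float(x + u)                       # stage cost


def f(x: int, u: int, w: int) -> int:
    return (x + u + w) % 2                    # next state in {0, 1}


mu = {0: 0, 1: 1}                             # stationary policy: control per state
alpha = 0.5                                   # discount factor

J0 = {0: 0.0, 1: 0.0}
J1 = apply_T_mu(J0, mu, g, f, disturbances, alpha)   # {0: 0.0, 1: 2.0}
J2 = apply_T_mu(J1, mu, g, f, disturbances, alpha)   # {0: 0.5, 1: 2.5}

# Repeated application approaches the fixed point of T_mu for this toy model.
J_fix = dict(J0)
for _ in range(200):
    J_fix = apply_T_mu(J_fix, mu, g, f, disturbances, alpha)
```

In this toy model the iterates approach J(0) = 1 and J(1) = 3, the fixed point of T_μ, since α < 1 makes the mapping a contraction on this bounded finite problem.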

Convergence of Sample Path Optimal Policies for Stochastic Dynamic Programming

We consider the solution of stochastic dynamic programs using sample path estimates. Applying the theory of large deviations, we derive probability error bounds associated with the convergence of the estimated optimal policy to the true optimal policy, for finite horizon problems. These bounds decay at an exponential rate, in contrast with the usual canonical (inverse) square root rate associat...

Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry

We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

Journal

Journal title: Mathematics of Operations Research

Year: 2013

ISSN: 0364-765X,1526-5471

DOI: 10.1287/moor.1120.0561